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ABSTRACT 

In order to maximize the sensitivity of pulsar timing arrays to a stochastic gravitational 
wave background, we present computational techniques to optimize observing sched- 
ules. The techniques are applicable to both single and multi-telescope experiments. 
The observing schedule is optimized for each telescope by adjusting the observing time 
allocated to each pulsar while keeping the total amount of observing time constant. 
The optimized schedule depends on the timing noise characteristics of each individual 
pulsar as well as the performance of instrumentation. Several examples are given to 
illustrate the effects of different types of noise. A method to select the most suitable 
pulsars to be included in a pulsar timing array project is also presented. 
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Ivan Haasteren et al.l [20l3) with a su b-project, the Large 
European Array for Puls ars (LEAP, iKramer fc Stapperj 
2010; Fc rdman et al.ll2010l '). combining data from the Lovell 
telescope, the Westerbork Synthesis Radio Telescope, the 
EfTelsberg 100-m Radio Telescope, the Nangay Decimet- 
ric Radio Telescope, and the Sardinia Radio Telescopj^, 
ii) th e Parkes Pu lsar T iming Array (PPTA ; iManchested 
,200^ iHobbs et al.i i2009l : IVerbiest et al] |2010| '1 using obser- 
vations with the Parkes radio telescope augmented by pub- 
lic domain observations from the Arecibo Observatory, iii) 
the North-American Nanoh ertz Observatory for Gravita- 
tional waves (NANOGrav, Ijenet et al. 1 120091 ') using data 
from the Green Bank Telescope and the Arecibo Obser- 
vato ry, iv) Kalya zin Radio Astronomical Observatory tim- 
ing (|Rodinl [2OIII ). Besides these on-going projects, inter- 
national cooperative efforts, e.g. the Inter national Pulsar 
Timing Array (IPTA. iHobbs fc et al.ll2010l ') or future tele- 
scopes with better sensitivity, e.g. the Fiv e- hundred- meter 
Aperture Spheric al Radio Telescope fFAST. lNaii et al.ll2006l : 
'Smits ct al.' '2009') and the Square Kilometre Array (SKA, 
Kramer & StaoDcrs 2010; Smits ct al. 2009), are planned to 
join the PTA projects to increase the chances of detecting 
GWs. 

Operational questions arise naturally from such PTA 
campaigns, e.g. how should the observing schedule be ar- 
ranged to maximize our opportunity to detect the GW sig- 



1 INTRODUCTION 

Millisecond pulsars (MSPs) are stable celestial clocks, so 
that the timing residuals, the differences between the ob- 
served and the predicted time of arrival (TOA) of their 
pulses, are usually minute compared to the total length of 
the data span. A stochastic gravitational wave (GW) back- 
ground leaves angular dependent correlations in the timing 
residu als of w idely separated pulsars (for general relativ- 
ity see iHellings fc Dow ns 1983,, for alternative gravity theo- 
ries see I Lee et al I l2008l .( 2010), i.e. the correlation coefficient 
between timing residuals of a pulsar pair is a function of 
the angular distance between the two pulsars. Such a spa- 
tial correlation in pulsar timing signals makes it possible 
to directly detect GW u sing pulsar timing arr ays (PTAs; 
iHellings fc Downs! 1 19831: iFoster fc Backed [l990l ). Previous 



analvses (jjenet et al.ll2005l ) have calculated PTA sensitivity 
to a stochastic GW background generated by super mas- 
sive blackholc (SMBH ) binaries at cosm ological distances 
ijjaffe fc Bac ker 2003: Sesana et al.|[2003 ). They have shown 
that a positive detection of the GW background is feasi- 
ble, if one uses state of the art pulsar timing technologies. 
Such encouraging results triggered consequent observational 
efforts. 

At present, several groups are trying to detect GWs us- 
ing PTAs: i) the European Pulsar Timing Array (EPTA ; 
IStappers etafl I2OO6I : iLazaridisI I2OI0I : iFerdman et afl l20ld : 
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The Sardinia Radio telescope is in the commissioning phase at 
the time of writing this this paper. 
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nal? How much will we benefit from such optimization? In 
this paper, we try to answer these questions. The paper is 
organiz ed as follow s : In S ection [2l we extend the formal- 
ism of IJenet et all (|2005h to calculate the GW detection 
significance as a function of observing schedules, i.e. the 
telescope time allocation to each pulsar. Then we describe 
the technique to maximize the GW detection significance 
in Section |3] Frameworks of the optimization problem are 
described in Section 13.11 and the algorithm to optimize a 
single and multiple telescope array are given in Section [3.21 
and Section 13.31 respectively. The results are presented in 
Section U and we discuss related issues in Section [S] 



2 ANALYTICAL CALCULATION FOR GW 
DETECTION SIGNIFICANCE 

In this section, we calculate the statistical significance S 
for detecting the stochastic GW background using PTAs. 
We consider TOAs from multiple pulsars, where each set 
may be collected from different telescopes or data acquisition 
systems. To detect the GW background, one correlates the 
TOAs between pulsar p airs and che c ks if the GW-induced 
correlation is significant. Ijenet et al.l (|2005l ) have calculated 
the GW detection significance for the case, where the noise 
in TOAs is of a white spectra with equal root-mean-square 
(RMS) level for all pulsars. To investigate the optimal ob- 
serving schedule, we have to generalize the calculation, such 
that we can explicitly check the dependence of the GW de- 
tection significance on the noise properties of each individual 
pulsar. 

Under the infiuence of a stochastic gravitational wave 
background, the pulsar timing residual R from a standard 
pulsar timing pipeline contains two components, the GW- 
induced signal s and noise from other contributions n. In 
this section, we determine the statistical properties of a and 
n first, and then calculate the GW detection significance. 

2.1 Statistics for GW-induced pulsar timing signal 

The spectrum of the stochastic GW background is usually 
assumed to be a power-law, in which the characteristic strain 
{he) of the GW background is he = Ao{f / fo)" ■ Here, Ao is 
the dimensionless amplitude for the background at fo — 
lyr~^, and a is the spectral index. Under the influence of 
such a GW background, the power spe ctrum Ssif) of th e 
GW-induced pulsar timing residual s is (|jenet et al.ll2005l ) 



Ssif) 



12^2/2- 



(1) 



GWs perturb the space-time metric at the Earth. This 
introduces a correlation in the timing signal of two pulsars. 
The correlation coefficient H{6) between the GW-induced 
signals of two pulsars with an angular separation of 6 is 
called the Hellings and Downs function (fHellings fc Downa 
1 19831 ) given as 



H{6) 



3-l-cos e _ 3(cos fl-1) 
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1, if 6* = 0. 
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The spectral properties, equation ([T} , together with the spa- 
tial correlation, equation ([2]), fully characterize the statisti- 
cal properties of the GW-induced signals. 



For an isotropic GW background, the correlations be- 
tween the GW-induced signals are 



(3) 



Here, we follow the notation that the subscript on the right 
is the index of sampling and the superscript on the left is 
the index for the pulsar. For example, we denote the fc-th 
measurement of a timing residual of the i-th pulsar as ^Rk, 
the GW-induced signal as 'sfe and other noise contributions 
as 'jift . (Tg is the RMS level for the GW-induced signal, ^•'9 is 
the angular distance between the i-th and j-th pulsar, and 
7fcfc' is the temporal correlation coefficient between the fc-th 
and fc'-th sampling. (Tg and 7^,4./ are numerically calculated 
from the GW spectrum as shown in Appendix 1X1 



2.2 Statistics of noise components from other 
contributions 

A purely theoret ical modeling of the noise part n is com- 
plex, because it llFoster fc Backerlll990l : [Shannon fc Corded 
I2OI0I : ICordes fc ShannonlbOlol T depends on the properties 
of each individual pulsar, the instrumentation, and radio 
frequency interference. We therefore model the noise phe- 
nomenologically using observational knowledge. In this pa- 
per, the noise part of a pulsar timing residual is modeled as 
a superposition of a white noise component and a red noise 
component, where the white noise is designated to the mea- 
surement uncertainty on the TOA du e to radiometer noise 
and pulse jitter noise l|Liu et al.ll2012l ). Timing residuals of 
several millisecond pulsars show clear evidence of tempo- 
rally correlated nois e, although its origin is not yet clear 
jVerbiest et al.ll2009l ). The red noise components are used 
to empirically model such effects. We further assume that 
the noise components are not correlated between any two 
different pulsar^. 

For each pulsar, three parameters are used to charac- 
terize the noise spectrum. These parameters are the RMS 
level of the white noise Vw, the RMS level for red noise Vr, 
and the spectral index for the red noise 75- The white noise 
spectrum is 



and the red noise spectrum is 



(Tw 



(4) 



(5) 



where the ct^ and ov are for normalization. From the spec- 
trum, one can derive the correlation between noise compo- 
nents 



fc' + 'f^igkk'^ Si. 



(6) 



Following our conventions, ^Uk, is the noise for the fc-th sam- 
pling of the i-th pulsar. The Sij is the 'Kronecker delta' sym- 
bol, i.e. Sij = 1, if i = j, otherwise Sij = 0. The parameters 
5^, (Tr, and correlation coefficients Qkk' are calculated using 
a numerical simulation shown in Appendix 1X1 



2 The correlated noise such as clock noise is discussed in Section[5l 
and Appendix [E] 
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2.3 GW background detection significance 

Following [jenet et all Hooi), we calculate the cross power 
'■'c between timing residuals and then compare it with the 
predicted correlation coefficient H{6) to check whether the 
GW signal is significant. The cross power '-'c is 



m ^ — ^ 



(7) 



where ^Rk, ■'Rk are the timing residuals of the z-th and the 
j-th pulsars for the fc-th timing session, m is the number of 
data points for a given pulsar. It is assumed that the data 
from different pulsars overlap with each other and the num- 
ber of data point s is identical fo r all t he pulsars, similar to 
the discussion in lYardlev et al.l ()201lh . These assumptions 
are good approximations in calculating the GW detection 
sensitivity. In the case where the TOAs are mis-aligned or 
the number of data points is not identical, one can combine 
data points so that the number of data points are identical. 
This operation retains most of the GW detection sensitivity 
due to: i) The RMS error reduces when data is combined, 
and such a reduction in RMS error compensates the reduc- 
tion in the number of data points, ii) The spectrum of the 
GW induced timing signal is steep, thus most of the GW 
induced signal is in the low frequency components, which 
are preserved during the combining operation. 

The comparison between the '-'c and the Hellings- 
Downs function is carried out by doing another correlation, 
which gives the GW detection significance S as 



(8) 

where the summation E - . sums over all independent 

'2 — J pairs ^ 

pulsar pairs except the case where i = j, i.e. 

N i~l 

E -EE' (9) 



and 
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(11) 



Given pulsars, the sum M = Ei-j pairs 1 = ~ l)/2 

is the number of independent pulsar pairs. To evaluate the 
quality of the detector, we need the expectation for the de- 
tection significance (S), which is 



where 



_l_ 

M 
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(12) 



(13) 



^ E [cm-{c)]A, (14) 

i — j pairs / 

and the (■) denotes the ensemble average. As we show in 



Appendix [B] the expected GW detection significance is 

_ 1 
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where 



+ {^\,kk' + ^Vr.kk'^ Ikk' + V.fefe' V,fefc'] , (16) 

'■'-B = — ( V + ^?7w + V ^7?w + % ^?7w + V , (17) 
m \ J 

and r\ denotes the ratio between the power of noise compo- 
nents and the power of the GW-induced signal, which are 



^r,fcfe' — 



(18) 
(19) 
(20) 



If the noise level Vn is identical for all pulsars and there is 
no red noise component ( Vw = "'ctw and Vr = for all the 
i,.7' = 1 . . . TV), equ atio n ([151) reduces to the result found by 
iJenet et all (|2005") and' Verbiest et all (|2009D . 

The ^■'A terms arc independent of telescope integration 
time, because they only contain the RMS level of the GW- 
induced signal and the red noise, both of which are indepen- 
dent of telescope integration time. The observing schedule, 
the plan of allocating telescope time to each pulsar, only 
changes terms of '-'-B, which depend on the ratio between 
signal and noise amplitude. By optimizing the observing 
schedule, we can reduce the summation of '-'_B, such that 
the GW detection significance is maximized. We present the 
technique to maximize the GW detection significance in the 
next section. 



3 OPTIMIZATION OF THE GW DETECTION 
SIGNIFICANCE UNDER OBSERVING 
CONSTRAINTS 

Pulsar timing observations are usually conducted as a se- 
ries of successive observing sessions. In each session, multi- 
ple pulsars are observed using either one or multiple tele- 
scopes. There are basically two ways to use multiple tele- 
scopes. The simple way is to use each telescope indepen- 
dently, combine the TOA data from each telescope, remove 
the time jumps between each data set, and form a single 
TOA data set (jjanssen et ahllioOSl l. The other way, as in 
the LEAP project, is to use telescopes simultaneously to 
form a phased array and then calculate the TOAs from the 
phased-array data. Due to such different methods of using 
multiple telescopes, the optimization techniques differ from 
one another. We answer the following questions in this sec- 
tion: i) What are the variables to optimize? ii) What are the 
constraints on the optimization? iii) How do we perform the 
optimization? 
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3.1 The objective, variables, and constraints of 
the optimization 

Given a fixed amount of telescope time, we can adjust the 
amount of observing time allocated to each pulsar. Increas- 
ing the observing time for one pulsar will reduce its timing 
measurement error, but will increase the timing error for 
other pulsars. Naturally, for the purpose of detecting GWs, 
the optimization objective is to maximize the expected GW 
detection significance (S), while the constraint is the total 
amount of telescope time for the project. 

Generally, for a timing project using A'^tci telescopes to 
observe A'' pulsars, we need 2^61 + "iN input parameters to 
characterize the whole timing project. 2A'tei parameters are 
used to characterize A'td telescopes, where each telescope is 
quantified by the gain {Q) and the total available telescope 
time (r). 3A*' parameters are used to characterize the timing 
behavior of A'^ pulsars. The red noise level and spectral 
index j3 are assumed to be pulsar intrinsic. The observed 
white noise RMS levels (Jw depend on telescope gain, tele- 
scope time allocation to the pulsar and other parameters 
intrinsic to the pulsar (e.g. flux, pulse width, and so on). 

For single telescope cases, we can encapsulate the de- 
pendence of the white noise level on the schedule into 
the norm alized white nois e leve l gp and pulse jitter noise 
level g.T (iFoster fc Backeil Il99d : IShannon fc Corded l2010l : 
ICordes fc ShannonlbOlOt T 



-2 , i 2 
+ 0-,7 



1/2 
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(21) 



where Vw is the measured RMS level for the white noise 
component of the i-th pulsar, V is the telescope time being 
used for the z-th pulsar per observing session, and Q is the 
telescope gain. The normalized noise level Vq and the Vj 
are used to characterize the radiometer noise and the pulse 
jitter noise of the pulsar. On the one hand, if there is no 
pulse jitter noise, (Jq will be the observed RMS level of the 
white noise for a 1 hour observation using a telescope with 
unit gain = 10. On the other hand, if we have a telescope 
with infinite gain, the pulsar timing accuracy will be limited 
by the pulse jitter, and crj will be the observed RMS level 
of white noise for a 1 hour observation. If phased array ob- 
servations are performed using multiple telescopes (as done 
in the LEAP project), the situation is identical to the case 
of a single telescope, and we use the effective gain for the 
array to determine the RMS level of noise. 

For multiple incoherent telescopes, the gain of A^tci 
telescopes can be summarized by a vector Qv, where v = 
1 . . . A^tci and the v-th component Qi, is the gain for the u-t\i 
telescope. In a similar fashion, the available telescope time 
of each telescope is summarized by the vector Ti, . The defini- 
tion of the observing schedule becomes more complex, since 
we need to specify the observation time for each pulsar us- 
ing each telescope. Furthermore, the schedule should also 
include information on the telescope availability, e.g. cer- 
tain pulsars may not be visible to some telescopes due to 
geographical reasons. In this paper, we use the resource 
allocation matrix O to describe the telescope availability. 



In tills paper, we use a fiducial unit for tiie gain, tiius one can 
take any telescope as unit 1 and scale other telescopes accordingly. 



where the i-th row and !/-th column element indicates 
whether we use the u-th telescope to observe the i-th pulsar, 
i.e. ^Ov = 1, if the v-th telescope observes the i-th pulsar, 
and ^Ov = otherwise. The telescope time allocation is de- 
scribed by another matrix, the schedule matrix P, where the 
i-th row v-ih column element *Pi/ is the time allocated for 
the v-t\i telescope observing the i-th pulsar. 

With the schedule matrix P, the equivalent RMS level 
of the white noise in the combined data is 

"(22) 

The main reason to introduce the resource allocation matrix 
^Ou is to take care of the complexity of telescope availability, 
and the telescope time can be treated in an identical way 
independent of the availability. 

For a single telescope or a phased array, the optimiza- 
tion constraint is 



(23) 



i.e. the total telescope time is pre-fixed to be r. For mul- 
tiple incoherent telescopes, the above constraint is applied 
to each telescope individually, i.e. the constraints specify the 
available observation time for each telescope, which gives 



(24) 



With the observing schedule (i.e. the vector V for a sin- 
gle telescope or the matrix ^Ov and ^P^ for incoherent tele- 
scopes), one can use equation (|21|l and (|22|l to determine 
the white noise level, then determine the GW detection sig- 
nificance as explained in Section[2]3] The optimization of the 
observing schedule means that we choose an appropriate V 
or Pu such that the expected GW-detection significance (S) 
is maximized under the constraint of equation (|23|l or (|24|l . 

From equation (|15p . the maximization of (5*) is equiva- 
lent to the minimization of the term ■ • ('M -I- '-'B). 
Because the terms of ^•'A are independent of the telescope 
time, the maximization of (S) by adjusting the observing 
schedule is thus equivalent to minimizing the objective func- 
tion C defined as 



(25) 



3.2 Optimizing a single telescope 

Our technique to optimize the single telescope schedule takes 
two steps. The first step is to convert the constrained opti- 
mization problem to a constraint-free version, and the sec- 
ond step is to solve the constraint-free optimization. For 
comparison, we have also developed an alternative semi- 
analytical iterative technique in Appendix [Cl 

To remove the constraints, we transform variables V to 
a new set of variables , where ^ is 1 . . . N ~ 1 , and the 
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transformation is 



/ ^1 \ 
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(26) 



of which the inverse transformation is 
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(27) 



Here it is implicitly assumed that V ^ 0, the Y\i s-nd Ilfi 
are the serial products using the index i and fj, respectively. 
Equations (|26p and (|27|) are, in fact, the transformation be- 
tween a Af-dimensional Cartesian coordinate system and its 
corresponding hyperspherical coordinate system. The con- 
straint in the new spherical coordinate system corresponds 
to fixing the radius of the hypersphere (fixing r), and all 
angular coordinates 6^ are free variables for which one can 
specify any value for 9^ without breaking the constraint, 
i.e. after the above coordinate transformation, the objective 
C becomes a function of a new variable 6^, the constraint 
equation 1)23^ is automatically satisfied for any 6^. We can 
then find the minimum of C using numerical methods for 
constraint-fre e problems. In this p aper, the downhill sim- 
plex method IjNelder fc Meadlll965l ) is adopted, which has, 
usu ally, better global co nverging behavior than other meth- 
ods (H ramer et al.lll994 ). Once the optimal 6^ are found, we 
transform them back to V using equation (|27p . which yields 
the optimal single telescope schedule. In practical situations, 
one also needs to add telescope slewing time, observing time 
for calibration sources and other necessary auxiliary time on 
top of this telescope time schedule to get the final schedule. 



3.3 Optimizing multiple incoherent telescopes 



Similarly to section [T2l the optimal observing schedule for 
multiple incoherent telescopes is calculated by minimizing 
C, although the constraints are slightly more complex here. 
In this section, we present the optimizing technique and dis- 
cuss later the relation between the optimization of multiple 
telescopes and single telescope optimization. 

From the multiple telescope constraint equation (|24p . 
we can see that the constraints are applied to each tele- 
scope individually. Thus, the generalization of the method 
presented in the previous section is straightforward by ap- 
plying the transformation equation (|26|) to each telescope 
separately. Take telescope 1 as an example. The first col- 
umn of matrix P, the ^Pv, is the observing schedule for the 
first telescope. We can transform those components of ^P^ 
indicated by ^Oi, = 1 using equation (|26p to remove the con- 
straint of the first telescope. Similarly, by applying the trans- 
formation to the other columns of matrix P successively, one 



can remove all the constraints. With the new constraint-free 
variables, we use the down-hill simplex method to find the 
optimization. We then transform back to *P^, which is the 
optimization schedule. 

In the optimization algorithm, we treat the resource 
allocation matrix O as input knowledge. A question natu- 
rally arises as to whether one can find a better observing 
schedule for the same telescopes with the same amounts of 
telescope time but with a different O, i.e. whether one can 
add pulsars to or remove pulsars from schedules of certain 
telescopes to increase the detection significance? The con- 
figuration with all 'Oi/ = 1 allows one to use any telescope 
to observe any pulsar, i.e. allows one to adjust the sched- 
ule with the maximal degrees of freedom. In this way, the 
optimization schedule of configuration ( ^Ov = 1) leads to 
the highest GW detection significance. If any schedule has 
the same detection significance as the optimal schedule with 
all *Oi/ = 1, we call such schedule 'global optimal'. To de- 
termine whether the global optimization is achievable, we 
first investigate the case when the pulse jitter noise can be 
ignored, we then discuss situations where the pulse jitter 
noise becomes important. 

When the pulse jitter noise is neglected, there is a close 
relation between the single telescope optimization and the 
multiple telescope optimization. In fact, under certain condi- 
tions, the optimization for multiple telescopes is equivalent 
to the single telescope optimization. To see this, we replace 
the variables 'P^ and 'O^ in equation (I22p by effective tele- 
scope time Ve as follows 



Ve = Ql 'p. V, . 

After ignoring the crj, equation (|22|l becomes 



(28) 



i i i -1/2 

0"w = O"0 T;, , 



(29) 

and the constrains equation (|24p reduces to a single con- 
straint 



(30) 



By comparing equations (|29p and (|30p with equation (|23|l 
and (|21|) . one can see that the optimization for multiple tele- 
scopes is very similar to the single telescope optimization. In 
fact, if there exists a unique solution of 'P^ to equation (|28p . 
which satisfies each individual constraint of equation (I24p . 
the multiple telescope and single telescope optimization are 
mathematically identical. 

The differences between multiple telescope and single 
telescope optimization lie in the difference between the con- 
straints equation H24p and (|3Up . For the single telescope 
case, only one constraint (equation I23p is involved. This is 
very different from the case of multiple telescopes, where 
Ntci constraints (equation I24p are present. The variable 
substitution in equation psp combines all iVtei constraints 
(equation [24J and forms a single constraint (equation [30)) , 
where the constraint of each individual telescope is ignored. 
Whether global optimization is achievable is, now, equiv- 
alent to whether one can find a solution to 'P^ for equa- 
tion (|28p . while satisfying all individual constraints (equa- 
tion [24}. 

Generally, when solving 'P^ from equation (|28p and 
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one meets three types of situations: a unique solution, 
multiple solutions and no solution. For the case of multiple 
solutions, there are multiple choices for the optimal sched- 
ule. All these configurations are identical in the sense that 
they give the same GW detection significance. The case of 
no solution can only arise when constraints for some tele- 
scopes cannot be met, i.e. some of the pulsars need more 
time than the telescopes can give, while some of the pulsars 
have more telescope time than should be assigned. Take the 
case with two telescopes and two pulsars as an example, 
where the gain of the two telescopes are the same, two pul- 

1 1 " 



sars have identical ctq, and 'O^ — 







i.e. only the 



second telescope can observe the second pulsar. Since the 
two pulsars have the same (tq, the total effective telescope 
time should then also be the same for the optimal schedule 
(i.e. ^Te = "^Tc). However if the first telescope does not have 
enough time, one gets < and the global optimal 
schedule is not achievable for such configurations. 

The case for which there no solution to equation (|28p . 
is due to an improper choice of telescopes. Most of the time 
the matrix O is determined by the sky coverage of the tele- 
scopes. Thus, if no solution can be found, one needs to seek 
telescopes with the appropriate sky coverage or extend the 
time for telescopes with the sky coverage. In order to identify 
such a no-solution situation we show in Appendix |D] that a 
solution to equation (|28[) exists, if the following conditions 
are satisfied 



Ve ^ Q^Ti, 'Oi/ , for any index of pulsar 



(31) 



One can identify which pulsar in equation H3ip fails. These 
failures indicate the corresponding elements of resources al- 
location matrix to be adjusted. 

We now consider the situation where the pulse jitter 
noise becomes important. Clearly, if we cannot ignore the 
pulse jitter noise, the effective telescope time is no longer 
linearly dependent on the telescope time 'P^ as in equa- 
tion (|28[) and we do not have a simple method to check if 
the global optimization is achieved. However we can still set 
all the *Oi/ = 1, find the global optimal strategy, and com- 
pare it with the optimal strategy for the input ^Ov to check 
if the global optimization is achieved. 

In Figure [U we give four examples for the optimization 
of multiple telescopes. In these examples, two telescopes are 
used to observe two pulsars, where the pulsar noise param- 
eters and the telescope parameters are specified in Table [T] 
These four examples are given as follows, i) 'Case A', two 
identical pulsars are observed with two identical telescopes. 
The global optimal schedule is achievable and one can ex- 
change telescope time between the two telescopes. As indi- 
cated in Figure [T] as long as we assign the same amount of 
effective telescope time Ve for two pulsars, it is the optimal 
observing schedule, ii) 'Case B', telescope 1 has twice the 
gain compared to telescope 2. Similar to 'case A', the global 
optimal schedule is achievable and one can exchange tele- 
scope time. But due to the gain differences, the telescope 
time exchange should be weighted by the gain, such that 
the same amount of effective telescope time is assigned to 
the two identical pulsars, iii) 'Case C, where telescope 1 has 
more time (1.5 hr) available compared to telescope 2 (1 hr) 



and only telescope 2 can be used to observe the 2nd pulsar 
(as indicated by the matrix ^O^). For this case, the total tele- 
scope time is the same as in 'case A', but we do not achieve 
the same low level of C as in 'case A', i.e. global optimization 
is not achievable. This is due to the constraint from matrix 
^Ov, which prevents us from reaching the global optimiza- 
tion as discussed, iv) 'Case D', where pulsar 1 is affected by 
pulse jitter noise. In this case, one does not have the freedom 
to exchange telescope time and optimization suggests that 
the low gain telescope (telescope 2) should spend more time 
on the pulse jitter affected pulsar (pulsar 1). 



4 RESULTS 

As examples, we use pulsar properties measured from data 
of the PPTA, EPTA and NANOGrav to show the potential 
benefit of optimizing the observing schedule. The parame- 
ters of the pulsars are given in Table [51 

Figure [2] shows the comparison between the GW detec- 
tion significance {S) for an unoptimized and an optimized 
PTA with the same parameters. We can see that the optimal 
observing strategy increases the GW detection significance. 
Evaluating from the rising edge of the curve, the optimized 
arrays are able to detect a GW background 2-3 times weaker 
than unoptimized arrays depending on the pulsar popula- 
tion. Because the radiometer noise aoQ~^ dominates over 
the pulse jitter noise for most pulsars using current timing 
techniques, we ignore the pulse jitter noise in these exam- 
ples. 

We also note that the larger the red noise level is, the 
less we gain from optimizing the observing schedule. When 
the amplitude of intrinsic red noises is large, the dom- 
inates the detection significance as shown in equation (|15p . 
and thus the detection is no longer sensitive to the schedule 
optimization, which only affects the ^^B terms. In fact, only 
if the pulsar noise level is sensitive to the observing sched- 
ule (i.e. when red noise does not dominate), the optimization 
will be effective. This conclusion applies to any type of GW 
detector. 

In the optimization process, one always uses a numerical 
technique to determine the optimal observation plan. It is, 
nevertheless, worth finding a rule of thumb to determine an 
'approximate' optimal strategy. We prove in Appendix [C] 
that, for strong GW cases, the optimal observing schedule 
weakly depends on the amplitude of the GW background 
and a good approximation for the optimal schedule is 



where the is defined as 



i 2 /-» — 2 . 2 



(32) 



(33) 



Here the 'Q, noise parameter, defines the noisiness of the 
i-th pulsar observed using a telescope of gain Q. 

By comparing the numerical optimization and the result 
from equation (|32p in Figure O we show that the optimal 
schedule is insensitive to the GW amplitude and can be 
well approximated by equation (|32|l . when the amplitude of 
the GW is large. For the case of a weak GW background, 
the optimal schedule for most of the pulsars is still close 
to equation (|32|) . but the optimal schedule for noisy pulsars 
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Table 1. The telescope configurations and pulsar noise parameters for each case given in Figure[T] ctq and cj take the unit of second, r 
takes the unit of hour, and the gain Q take an arbitrary fiducial unit as explained in the main text. 



case A: 



case B: 



Radiometer noise 
Jitter noise 
Observation time 
Telescope gain 



: 10" 
"fTj =0 

n = 1 
Gi = 1 



Resource allocation matrix ^Oi, 



'(TO : 



10-^ 
=0 
T2 = 1 
02 = 1 



Iffj = 

n = 1 
6i =2 



CO 



: 10" 
^<7J =0 

r2 = 1 
92 = 1 



case C: 



case D: 



Radiometer noise 
Jitter noise 
Observation time 
Telescope gain 

Resource allocation matrix 



- 10- 
= 
Tl = 1.5 

Gi = 1 



(TO 



CO 



10-^ 
^o-j = 
T2 = 0.5 
G2 = 1 



Icro = 10-7 
l(Tj = 10-7 

ri = 1 

Gi=2 

1 1 



1 1 



o-o 



: IQ- 

"tj =0 

T2 = 1 

G2 = 1 



Table 2. Parameters for PPTA, EPTA and NANOGrav pulsars taken from Hobbs et al. (2010). We assume that all the white noise is 
due to the radiometer noise and pulse jitter noise can be ignored. 



PSR J 


P 


Pb 


S1400 


Array 


PPTA era 


EPTA cro 


NANOGrav ctq 




(ms) 


(d) 


(mjy) 






(ps) 




J0030+0451 


4.87 


_ 


0.6 


EPTA, NANOGrav 


_ 


0.54 


0.31 


J0218+4232 


2.32 


2.03 


0.9 


NANOGrav 






4.81 


J0437-4715 


5.76 


5.74 


142.0 


PPTA 


0.03 






J0613-0200 


3.06 


1.20 


1.4 


PPTA, EPTA, NANOGrav 


0.71 


0.45 


0.50 


J0621+1002 


28.85 


8.32 


1.9 


EPTA 




9.58 




J0711-6830 


5.49 




1.6 


PPTA 


1.32 






J075H-1807 


3.48 


0.3 


3.2 


EPTA 




0.78 




J0900-3144 


11.1 


18.7 


3.8 


EPTA 




1.55 




J1012+5307 


5.26 


0.60 


3.0 


EPTA, NANOGrav 




0.32 


0.61 


J1022+1001 


16.45 


7.81 


3.0 


PPTA, EPTA 


0.37 


0.48 




J1024-0719 


5.16 




0.7 


PPTA, EPTA 


0.43 


0.25 




J1045-4509 


7.47 


4.08 


3.0 


PPTA 


2.68 






J1455-3330 


7.99 


76.17 


1.2 


EPTA, NANOGrav 




3.83 


1.60 


J1600-3053 


3.60 


14.35 


3.2 


EPTA, PPTA 


0.32 


0.23 




J1603-7202 


14.84 


6.31 


3.0 


PPTA 


0.70 






J1640+2224 


3.16 


175.46 


2.0 


EPTA, NANOGrav 




0.45 


0.19 


J1643-1224 


4.62 


147.02 


4.8 


PPTA, EPTA, NANOGrav 


0.57 


0.56 


0.53 


J1713+0747 


4.57 


67.83 


8.0 


PPTA, EPTA, NANOGrav 


0.15 


0.07 


0.04 


J1730-2304 


8.12 




4.0 


PPTA, EPTA 


0.83 


1.01 




J1732-5049 


5.31 


5.26 




PPTA 


1.74 






J1738+0333 


5.85 


0.35 




NANOGrav 






0.24 


J1741+1351 


3.75 


16.34 




NANOGrav 






0.19 


J1744-1134 


4.08 




3.0 


PPTA, EPTA, NANOGrav 


0.21 


0.14 


0.14 


J1751-2857 


3.91 


110.7 


0.06 


EPTA 




0.90 




J1824-2452 


3.05 




0.2 


PPTA, EPTA 


0.39 


0.24 




J1853+1303 


4.09 


115.65 


0.4 


NANOGrav 






0.17 


J1857+0943 


5.37 


12.33 


5.0 


PPTA, EPTA, NANOGrav 


0.82 


0.44 


0.25 


J1909-3744 


2.95 


1.53 


3.0 


PPTA, EPTA, NANOGrav 


0.19 


0.04 


0.15 


J1910+1256 


4.98 


58.47 


0.5 


EPTA, NANOGrav 




0.99 


0.17 


J1918-0642 


7.65 


10.91 




EPTA, NANOGrav 




0.87 


1.08 


J1939+2134 


1.56 




10.0 


PPTA, EPTA, NANOGrav 


0.11 


0.02 


0.03 


J1955+2908 


6.13 


117.35 


1.1 


NANOGrav 






0.18 


J2019+2425 


3.94 


76.51 




NANOGrav 






0.66 


J2124-3358 


4.93 




1.6 


PPTA 


1.52 






J2129-5721 


3.73 


6.63 


1.4 


PPTA 


0.87 






J2145-0750 


16.05 


6.84 


8.0 


PPTA, EPTA, NANOGrav 


0.86 


0.40 


1.37 


J2317+1439 


3.44 


2.46 


4.0 


NANOGrav 




0.81 


0.25 
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Figure 1. Four examples to illustrate the optimization of multiple telescopes. Here, we plot £, defined in equation |(25}, as a function of 
telescope time. The smaller C is, the better the observing schedule. In all examples, two pulsars are observed with two telescopes. The 
telescope configurations and pulsar noise parameters for each case is given in Table[l] The x-axis and y-axis are for the ^ri, i.e. the time 
for the 1st telescope observing the 1st pulsar, and '^T2, i.e. the time for the 2nd telescope observing the 2nd pulsar, respectively. 



(pulsars with larger *Q) starts to deviate from the analytic 
approximation. 

There are a few more general conclusions on improv- 
ing the timing accuracy independent of GW detection al- 
gorithms. In order to improve timing accuracy, one can use 
telescopes with higher effective gaiqj or increase telescope 
time. Increasing the telescope gain reduces (t„ and increas- 
ing telescope time reduces both o"w and aj. The example of 
case 'D' in Figure [T] shows that one should use a low gain 
telescope on those pulsars with larger jitter noise level, while 
using a high gain telescope on pulsars with larger radiometer 
noise. This is further supported by the results in Figure |4l 
which shows that increasing the telescope time is more ef- 
fective than using a high gain telescope for pulse jitter noise 
dominant pulsars. 

Besides providing the optimal schedule to detect a GW 
background, our technique answers the question about the 
optimal number of pulsars one should observe in a PTA with 
a given amount of telescope time. The number of pulsars in 
a PTA has two effects on the GW detection significance. 
Firstly, from equation (|15p . one can see that the signifi- 
cance increases as {S) oc -/a7 ~ TV. Secondly, since the 

* Here, telescopes with higher effective gain Q can be telescopes 
with larger collection area, lower system temperature, wider 
bandwidth and so on. 



telescope time is limited, observing more pulsars increases 
the TOA noise level, i.e. cr^ cx A'^"'^ given a fixed amount 
of telescope time. When N becomes large, the two effects 
mentioned above cancel each other out, and the detection 
significance becomes insensitive to N . In general, when the 
number of pulsars (A) is small, the detection significance 
increases with N. If all pulsars have the same noise level, 
the detection significance saturates for large N , where the 
saturation level is mainly determined by the available tele- 
scope time. In practice, when trying to include more pulsars 
in a timing array, pulsars with a higher noise level will be 
inevitably included, such that the detection significance will 
decrease for any GW detection algorithm. In this way, ob- 
serving more pulsars does not necessarily help detecting the 
GW background, unless one gets more telescope time. Given 
the telescope time, the number of pulsars, at which the GW 
detection significance achieves its maximum, is the optimal 
number of pulsars one should use in the PTA. We propose 
the following algorithm to determine the best sample of pul- 
sars to observe. 

(i) From a group of to-be-observed pulsars, choose the two 
pulsars with smallest noise levels, then optimize the sched- 
ule, and compute the GW detection significance. 

(ii) Include one extra pulsar, optimize the schedule, com- 
pute the GW detection significance, loop over the rest of 
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Figure 2. The GW background detection significance for PPTA, EPTA and NANOGrav with 5 year bi-weekly (one session every two 
weeks) observations. The white noise levels cro a^re taken from Table[2]and the red noise levels Vr are given above each panel. The x-axis 
shows the characteristic strain Aq of the GW background with spectral index of Q = —2/3. The y-axis shows the expected detection 
significance (S). The solid lines, dashed lines, and dashed-dotted lines are for PPTA, EPTA and NANOGrav respectively. The thick 
lines are the optimized cases, while the thin lines are the unoptimized versions. Here, the constraint is the observation time, i.e. for each 
project, the total amount of observing time of each session is fixed to be 20 hours. If the red noise level is zero, the optimized array is 
able to detect 2-3 times weaker GW signals compared to its unoptimized version depending on the pulsar population. 



the pulsars and find the new pulsar leading to the largest 
detection significance. 

(iii) If the GW detection significance increases by includ- 
ing the new pulsar, add this pulsar to the list, and repeat 
the second step for the rest of the pulsars, otherwise the 
optimal set of pulsars is already achieved. 

When pulsar surveys discover a new pulsar, we can 
check whether it is worth including it in the PTA by us- 
ing the above algorithm, although, to start with, sufficient 
observations are still necessary to measure the noise prop- 
erties of this pulsar. 



5 DISCUSSIONS AND CONCLUSIONS 

In this paper, we have investigated a technique to optimize 
the allocation of telescope time in a pulsar timing array 
to maximize the GW detection significance given a fixed 
amount of telescope time. This is done in two steps. First, 
the GW detection significance using a correlation detector 
is calculated analytically as a function of the white noise 
and red noise level of each pulsar, i.e. equation (|15p . Sec- 
ondly, the GW detection significance is optimized under the 
constraints that the allocated telescope time is fixed. The 
constrained optimization is converted to a corresponding 
constraint-free optimization using the coordinate transfor- 



mations of equation H26p and (|27|l . Finally, the optimization 
is solved numerically using the downhill simplex algorithm. 
For characteristic PTAs, the optimized arrays are able to de- 
tect a GW background 2-3 times weaker than unoptimized 
arrays depending on the pulsar population. Besides the sin- 
gle telescope case, we also derive the optimization algorithm 
for multiple telescopes. We also examine the links between 
the multiple and single telescope optimization. We investi- 
gate the optimal number of pulsars to observe for a given 
PTA, where the algorithm to construct the optimal group 
of pulsars from candidates is also included. 

In our optimization, the total telescope time r and t,_, 
are input parameters, that need to be determined before 
optimizing the schedule. When defining a PTA project, one 
can start with a reasonable amount of telescope time, say 
20 hours per each session/telescope, optimize the schedule, 
calculate the detection significance {S), check if the detec- 
tion is sensitive enough to the predicted GW background, 
and then adjust the input total telescope time accordingly, 
i.e. increase total telescope time, if the PTA is not sensitive 
enough. 

In practice, the detailed numerical values for the opti- 
mal observing strategy are still to-be-determined, due to the 
lack of measurements for the necessary pulsar noise parame- 
ters. To determine the realistic optimization strategy, these 
parameters are critical. A detailed investigation on the indi- 
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Figure 3. The optimal telescope time for each pulsar as a func- 
tion of its white noise level. The x-axis shows the one-hour pulsar 
noise level Vq- The y-axis shows the optimal telescope time for 
the pulsar per observing session. The EPTA pulsars are used in 
this demonstration, where the total telescope time is 24 hours, 
i.e. the average telescope time for each pulsar is 1 hour. Wc 
indicate Vq for each pulsar using a vertical line on the top 
of the figure. The dotted-dashed line, dashed line, dotted line 
and solid line are for optimization results with GW amplitude 
of Ao = 10-12,10-1^,2 X 10-13^ and 10-^'' respectively. The 
results of equation l|32p overlaps with the dotted-dashed line. 
For GWs with amplitude between lO-^^ and IQ-l^, the optimal 
schedules are very close to each other. For the weaker GW cases, 
e.g. Ao = 10"!*, the optimal schedule starts to deviate from the 
approximation equation I I32II . Such a deviation is mainly due to 
the pulsars with a high noise level. Since these noisy pulsars will 
not contribute significantly to the GW signal detection, the op- 
timal algorithm starts to reduce its observing time. For most of 
the pulsars, the optimal schedule is still close to the result from 
equation (|32j. 




Log(A„) 

Figure 4. The GW detection significance (S) (y-axis) as a func- 
tion of GW amplitude log{Ao) (x-axis), given different telescope 
gain and amount of telescope time. In the calculation, we as- 
sume that the PTA is composed of 20 pulsars, all of which have 
CTQ = 100 ns and cj = 100 ns. We plot three telescope configura- 
tions here, each of which is labeled with the numerical value of 
total telescope time r and telescope gain G- Here the telescope 
time is in units of hours, the gain takes an arbitrary fiducial unit 
as in equation I I21I I. For the following two PTA configurations, 
the {g = 10,r = 20} and the {g = l,r = 20 X 10^}, all pul- 
sars have the same level of radiometer noise, which is 10 times 
smaller than the case where {(? = 1, r = 1}. However, because of 
the pulse jitter noise effect, the configurations with more telescope 
time performs better than the cases with higher system gain. Fur- 
thermore, by comparing {Q = 10, r = 20} case with the case of 
{g = 1,T = 20 X 10^}, we conclude that it is important, even 
for a high gain telescope, to acquire sufficient telescope time in 
order to improve the GW detection significance, when the pulsar 
becomes pulse jitter noise dominant. 



vidual timing properties of potential PTA pulsars is highly 
necessary, from which further observations will benefit. 

One may encounter the situation, in which the optimal 
strategy requires longer observing time for certain pulsars 
per session than their transit time. If this happens, one has 
to split one observing session into several successive sessions. 
The GW detection significance is not significantly impaired 
by session splitting, since the GW induced timing signal has 
a very red spectrum and we are detecting its low frequency 
component. However the session splitting can lead to incon- 
veniences in practical arrangements. A better way to avoid 
such a situation is to construct the PTA using telescopes 
with enough geographical coverage, which can be one of the 
driving reasons for the IPTA project. 

The optimization in this paper is designed to maximize 
the GW detection significance. Our o ptimization is buil t for 
the correlation detector proposed bv ljenet et al.l (|2005l ). As 
shown in Appendix [Cl our optimization, which is very close 
to minimizing the total PTA noise power, can be different 
from optimization for other types of GW detectors e.g. fre- 
quentist detectors (Verbiest et al..,2009.: .Yardlev et al..,2011, ) 



or the Bayesian detector 



, Haasteren et al.H2009l . |2011^ 

In fact, as already shown bv lBurt et al.l ( 20111 ). one can get 
a different optimal observation schedule, when focusing on 
single source detection. We have ignored the information of 



historical non-overlapping data in our algorithms, because 
the non-overlapping data has very limited contributions to 
the cross-power of GW signals, although these data can be 
important to constrain the upper limit of GW amplitude 
(|van Haasteren et al.l [2OI1I ). Furthermore, the PTA offers 
an opportunity to investigate much broader topics, e.g. in- 
terstellar medium effects, time metrology, and so on. This 
paper, thus, by no means claims that our objective func- 
tion is the only one we should pursue. However our basic 
framework of optimization, constraints and related numer- 
ical techniques will be the same for other detectors. It is 
straightforward to generalize our method to these detectors. 
For Bayesian detectors, there is no analytical expression for 
the detection significance at present. The Baysian detectors 
are usually computationally expensive, which makes the op- 
timization difficult at this stage. 

Figure [5] shows that increasing the levels of red noise 
does not decrease the saturation level of the detection sig- 
nificance, i.e. the detection significance at large GW ampli- 
tude. This seems t o contra dict the conclusion of ljenet et al.l 
(2005.) and Verbiest et al.l l|2009i) . where the red spectrum 
of the GW limits the saturation level. As shown in equa- 
tion (|15p . the term limiting the saturation level is due to 
the term {l + HC^e))yly in 'M, which is the non-zero cor- 
relation between GW signals of two different pulsars. Since 
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the red intrinsic noise of pulsars, unlike the GW induced 
signal, are uncorrelated, it is natural that they do not limit 
the saturation level. 

In this paper, we use a phenomenological model to de- 
scribe the noise component, which is a superposition of a 
telescope-time dependent white noise component and an- 
other red noise component independent of telescope time. 
The reason for using such a phenomenological model is to 
use the observational information and to introduce mini- 
mal theoretical assumptions. Our white noise term contains 
both the radiometer noise and pulse jitter noise. Although 
the radiometer noise is the current bottleneck in timing ac- 
curacy of most M SPs ( aoG~^ 3> crj), the pulse jitter noise 
(|Liu et al.l l201lh . can be a potential limitation for future 
single dish high gain telescopes. Similarly, red noise can be 
another limitation for the long term timing accuracy. De- 
tailed studies on the pulse jitter noise and red noise proper- 
ties and related mitigation algorithms will be useful for the 
future prospects of GW detection. 

We assumed that the noise sources are not corre- 
lated among pulsars. This may be valid for all the noise 
of astrophysical origin, although the clock error can be 
an identical noise am o ng all pulsars jHogan fc Ree^ 1 19841 : 
iFoster fc BackeJll990l : lManchesteilll994l : lTintdl201lf ). The 
clock error may also introduces correlations in TOAs from 
different telescopes, since observatory clocks are usually syn- 
chronized. H owever, than ks to the red spectrum of most 
clock errors (|Riehlell200i l. one can completely remo ve the 
noise using simultaneous differential measurements (|Tintd 
Furthermore, as we argue in Appendix[El such a com- 
mon noise source can be significantly suppressed by post 
processing, if each session is compact within the time scale 
of days. In this way, our assumption is justified that different 
noise sources are not correlated among pulsars. 

Our optimal observation strategy is for allocating tele- 
scope time among pulsars. This only affects the white noise 
related part ( ^■'B terms) in GW detection significance (equa- 
tion (|15|) ). In fact, one can also specify the epoch of each ses- 
sion to minimize the effect of the red noise related part ( 
terms). Since these ''■'A terms ca n be more effectively re - 
duced using whitening techniques l|jenet et al.ll2005l . |2006| ). 
we do not optimize the epoch of each session to avoid sam- 
pling artifacts and to reduce the complexities in observation 
management. The discussions for the frequency of observa- 
tion sessions are omitted, since it is not bounded in terms of 
optimization, i.e. observing more frequently sessions simply 
increase the sensitivity. 
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Figure Al. The correlation coefficient for 128 regular samples of 
a red noise power-law. For illustration purposes the spectral index 
is -2 and we removed the fitted parabolic term in the signal s(ti). 
The X-axis and y-axis show the index i and j, respectively. The 
colors indicate values of the correlation coefficients. The diagonal 
elements of the plot are not equal to each other. This clearly 
show that the polynomial fitting breaks the stationarity of the 
red noise. 



APPENDIX A: PULSAR TIMING 
CORRELATION 

In this section, we calculate the cross correlation for a Gaus- 
sian random signal with a power-law spectrum. We discuss 
two main issues arising in the calculation, the effect of poly- 
nomial fitting on the cross correlation and the effects of 
spectral leakage. For a stationary continuous-in-time ran- 
dom signal s{t), which has a prescribed power spectrum of 
Sa{f), the well-known Wiener-Khinchin theorem states that 
the autocorrelation {s{ti)s{tj)} is the Fourier transform of 
the power spectral density, i.e. 



Ssif)e 



df. 



(Al) 



However due to the fitting of polynomials, the direct ap- 
plication of the above to the pulsar timing problem needs 
revision. 

In a practical pulsar timing data reduction pipeline, 
one usually uses a least-square polynomial fitting to extract 
the pulsar parameters. Such a fit is not a stationary pro- 
cess, which prevents us from calculating the correlation di- 
rectly using equation (|A1|) . In this paper, correlations are 
calculated via numerical simulations, that take the follow- 
ing steps: i) Generate a series of the sampled si gnal using 
indivi dual frequency components as described in I Lee et al.l 
l|2008 h ii) Fit the signal with a polynomial to simulate the 
effects of fitting the pulsar rotation frequency and its deriva- 
tive, iii) Calculate the required cross correlation, iv) Repeat 
steps (i), (ii), (iii) and average the cross correlation, until the 
required precision is attained. In this paper, the correlations 
are calculated to a relative error of 1%. From the numerical 
results in Figure lAll the polynomial fitting clearly breaks 
the stationarity of signals. 

The other issue is the spectral leakage. A red noise sig- 
nal with a steep spectrum introduces leakage from low fre- 
quency components into high frequency components. For a 



red noise signal with a spectral density of 5's(/) ~ /~^, the 
amplitude for those signal components within the frequency 
range [/, / -I- Sf] is f~^^^Sf^^^ . The waveform of this com- 
ponent will be sinusoidal, i.e. s{t) ~ f~l^/'^Sf^^'^ exp{2nfit). 
At the low frequency limit, where ft <^ 1, s{t) can be ap- 
proximated using Taylor series 



sit) ~ r'^'Sf^' eM2-^ft) - Sf^- ^ 



(A2) 

If we fit the waveform with a n degree polynomial, we effec- 
tively remove the leading terms up to j"+3/2-/3/2 Qig^rly, 
if n + 3/2 — /3/2 ^ 0, the signal amplitude goes to infinity, 
when / — >■ 0. Thus there will be no low frequency cut-off 
in the spectrum, even for data with a finite length. In this 
case, the low frequency components are always dominant. 
To guarantee that the low frequency cut-off rises naturally 
(regularize the signal), we need to fit the time series to a 
polynomial with degree n > /3/2 — 3/2. For example, the 
SMBH GW background introduces a pulsar timing signal 
with a spectral index of /3 = 13/3, hence we need at least 
n = 1, i.e. fitting with a first order polynomial (a straight 
line) , although the spectral leakage from low frequency com- 
ponents is dominant. 

We now take a closer look at the effects of the fitting on 
the RMS level of a signal with a power-law spectrum given 
by 



Ss(/) = So/-'',with/G [/l,(X)), 



(A3) 



where the /l is the lowest frequency cut-off. So is the power 
spectrum value at the unit frequency. Using the data length 
(T) as the temporal unit, the RMS level of such signal is 



rl-,8 



(A4) 



where /o,l = T/l is the dimensionless lowest cut-off fre- 
quency. 

The polynomial fitting to regular sampled data using 
a ^■^-norm is equivalent to the following linear time-variant 
filter 

71 

h{ti,t2) = Y,i'2l + l)Pi{'2ti - l)Pi{2t2 - 1) , (A5) 
;=o 

where ti,t2 € [0, 1] are dimensionless time, n is the order of 
the fitting polynomial, and Pi{-) is the Z-th order Legendre 
polynomial. Denoting the signal as s{t) and the polynomial 
fitting of such a signal as so{t), we have 

So{t) = / h{t,t2)s{t2)dt2, (A6) 

Jo 

and the residual is s'{t) = s{t) — so{t). 

The average RMS level of post-fitting residual becomes 



dts'ity 





One can show that 

»1 /■! " 



1 



h{t, ti)h{t, t2)s{ti)s{t2) dtdtidt2 . 

(A7) 



JO 



^(2Z-fl)P,(2ti-l)P,(2t2-l)C(ti-t2)dtidi2, 

(A8) 
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where C(ii — t2) ~ {s(ti)s(t2))- For the case of a power-law 
spectrum, we have 



1-/3 



C{t) = S'oT" 



-1 /o,L 



/3- 



tF2 



1 3-/? 2 2 2 



+r''-'(27r)''~'r(l-/3) 



.1/3-1 



(A9) 



where r is dimensionless time in units of T, 1^2 (■) 
is the generahzed hypergeometric function (see also 
Ivan Haasteren et al for the series presentation), and 

r(-) is the gamma function. 

Integrating equation (|A8|) . one reads 



^ _(27r)^(-l)'=+"/,fL+'"''(l + n)r(fc)sin(7r/3) 
^ (1 + 2k 



+ 



/3)r(2 + 2fc)r(fc - n)r(2 + k + n) 



r( ^+\"+^ )r(i + /3) 



(AlO) 



Clearly, if 2n + 3 ^ /3, we have 



lim a" = ^oT 



/3-I9/3-2 



2''-^(l+n)7r'° 



_^ r(3M)r(^)r(A±g) 
r(2±%±^)r(i + /3) 



(All) 

i.e. the RMS a'^ is regularized to be a finite value for 
/o,L — > 0, which confirms our previous estimation. Usually 
pulsar spin and spin-down are subtracted from the data. 
This corresponds to the case of n = 2. For GW induced sig- 
nal, we have /3 = 13/3 and So = ftc(yr"^)(127r2)- Vr"*''^ 
which gives 

.'^ ^ ,g(,,-.)3-(2.)-/-rW3r(|) 

^ 27- 5- 72 • 11-13 ^ ' 

(A13) 



2.55 X 10-^ft,V^°/Vr-*'" . 



This is identical to the results of Ivan Haasteren fc LevinI 
l|2012t l. 

It should be also noted that for such case (/? = 13/3 
and So = /!.c(yr"^)(127r^)~ Vr"'*''^) equation (fXil) becomes 



2.53 X W'-'hiT 



'i,2rpW/:i r-lO/3 



/o,L ' ^ yr 



-4/3 



(A14) 



In this way, estimating RMS value of fitted signal using 
equation (|A4|) is accurate enough for practical purposes, 
given data length is adopted as the effective cut-off fre- 
quency, i.e. to use /o,l ~ 1. 



APPENDIX B: GW DETECTION 
SIGNIFICANCE 

In this section, we calculate the GW detection significance. 
The expected value for the detection significance (S) de- 
pends on Ec and {'-'c) as shown in equation (|12|) . We deter- 
mine the { '-'c) first. From equation 0, we have 



^ m 

c) = -J2{'Rk'Rk} = alHre). 



(Bl) 



To determine Ec, we need {'■'c'^), which is calculated in 
a similar fashion such that 



^ , m 771 



(B2) 



After using the correlation relation equatio n eq ua- 
tion ([6]), and performing the Wick expan; 
calculate higher momentum, we have 



m (Mt, eq i 



to 



(B3) 



fe=l fc'=l 
The Ec is then 



Ec 



\ \ * — J pairs 



y * — J pairs 

with which one can derive equation (fT5)l . 



(B4) 



APPENDIX C: OPTIMIZATION USING 
LAGRANGIAN MULTIPLIER 

In section O we describe the optimization technique us- 
ing variable transformation and numerical optimization. In 
this section, we introduce another method solving the con- 
strained optimization problem directly. As an example, we 
present the technique for the single telescope situation, 
which is also readily generalized. 

The optimization problem for the single telescope case is 
to search the minimal value of yC = '-'B under the 

f ^ 1 — J pairs 

constraint that r = "^27^1 Using a Lagrangian multiplier 
A, we can re-cast the optimization problem to optimize the 
jC' , where 



(CI) 



i — j pairs \ i— 1 / 

where V = (Vg^^^ + 'crj)a~^ and 'k ^ 1 + 'a'}a-'\ The 
minimization of jC' can be found by 

d£' 

dC 

dX 

which give 



= 0, 
= 0, 



(C3) 
(C4) 



^ E '^^i E ^-^0, (C5) 



and 



(C6) 



It is easy to check that equation HC5|) and (|C6|) can be 
solved using the following recipe: 

1. Guess an initial value for V. 

2. Update V with a newer value using 
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(C7) 



3. Repeat step 2, until the required precision is achieved. 

One can monitor the change of V for each iteration 
until it converges to the necessary precision. The initial value 
for the iteration is determined from the strong-signal limit, 
i.e. *g — >■ 0, where the iteration equation (|C7|I reduces to a 
solution 



+ 0{q) . 



(C8) 



It is worthwile noting that, in fact, the iteration process 
(equation (IC7|) ) will not change the results very much from 
the initial value equation (|C8|I . This is due to 



' +o{- 



Er=i/^ 



(C9) 



where 'Q — ^af^Q^^ + a^. Clearly, the initial value equa- 
tion (|C8[) we use is already a good approximation to the 
optimal observation strategy. 



APPENDIX D: THE CONDITION OF 
EXISTING SOLUTIONS TO 

In this section, we investigate the conditions to be satisfied 
such that the following equations have solution Vi, for given 
Ve, G^, 'O^, and t„, 



iVtol 

Vc = ^ gI 'p^ 'Ou 

v = \ 
N 

i=l 

'P. > , 



(Dl) 



(D2) 
(D3) 



where i is 1 . . . A'^, v is l...A'toi, and Vc satisfies equa- 
tion (p8)) . 

We want to prove that equation (|D2|) . and (|D3|) 

have solutions 'P^, if and only if, for any i, the following 
condition is satisfied. 



(D4) 




Figure Dl. The illustration for the accessible region of the sum- 
mation of vectors. Here Xi, X2, X3 are the coordinates. Because 
of the we choose, the vector vi is constrained to the region 
Ai (the plane constrained to the shaded triangle) and the vector 
V2 is constrained to the region A2 (the line segment). The acces- 
sible region for the summation (^3 = vi + V2) of two vectors is 
clearly the ^3, i.e. a plane with extra constraints. 



part. If 'P,y is the solution, then 



J2 ^2 n 'o. < ■ 



(D5) 



where the second step is due to the constraint of equa- 
tion (|D2p and (|D3|) . ii) The 'if part. It is easy to note that, 
under the constraints of equation (|D2|) . any linear combi- 
nation of the column vectors, e.g. Pi = c,^ ''Oi,, be- 
longs to a hyperplane 5, because Eili Pi ~ Ei/ '''vCv Due to 
equation (ID3|l , only part of the hyperplane S is accessible to 
the vector Pi, i.e. Pi is constrained by ^ /3i ^ E^ c^r^ *Oy 
in the S. Denoting such an accessible region as A, and let 
c„ =Gv- Clearly, if the vector Vc is in the accessible region 
A, there exists a solution to equation (|Dip . which satisfies 
both equation HD2[I and (|D3|I . From equation (|28|l . we know 
Ve < E^i"! ^^Ti' 'Oi/, thus the vector Ve belongs to the 
accessible region A, and the solution exists. 

A graphical illustration for the condition is given in Fig- 

1 1 

ure lDll where we choose ^Oi, 



The proof contains two steps as follows: i) The 'only if 



APPENDIX E: COMMON NOISE MITIGATION 

Clock errors and other common noise sources are harmful to 
PTA observations. Tinto (201Jj) has shown that a simulta- 
neous differential measurement of a pulsar TOA completely 
removes the clock error. Instead of the requirement for si- 
multaneous observations of multiple pulsars, we argue in this 
section that one can still remove most of the clock errors 
in post processing without simultaneity, if each observation 
session is compact enough. 

Suppose the pulsar timing signal contains an identi- 
cal (clock) noise n^it) with a red spectrum. A subtraction 
between the timing signals of two different pulsars at two 
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close epochs will remove most power of the identical noise 
component. One can check this by looking at the residuals 
(i.e. nc{t) — nc{t + A)) of the identical noise after the sub- 
traction. It is easy to show that the power spectrum of the 
residual Sn{f) becomes 



where Sc(f) is the noise spectrum of nc, A is the time dif- 
ference between the two epochs and is thus roughly the time 
span of one observing session. The clock noise Sc{f) domi- 
nates at low frequencies with time scales of ten years, thus 
we can check the residual at such frequencies, where the 
residual power spectrum becomes 



Thus, if we keep each observation session compact within a 
few days, the clock error can still be significantly (factor of 
~ 10®) suppressed by post processing, and we do not need to 
worry about such common noise for planning an observing 
schedule at this stage. 



^n(/)=Sc(/)[l-cos(27r/A)], 



(El) 




(E2) 



